Neighbourhood Thresholding for Projection-Based Motif Discovery
Abstract
The PROJECTION algorithm by Buhler and Tompa is one of the best existing methods for solving hard motif discovery problems for monad motifs of fixed length l. In this paper we introduce the AGGREGATION algorithm, which, like PROJECTION, projects all l-mers from the given input sequences into buckets but uses a different scheme for selecting buckets for subsequent refinement search. This new neighbourhood-based thresholding scheme allows AGGREGATION to discover motifs in biased background sequences that cannot be found by PROJECTION. In other cases, AGGREGATION finds motifs of the same quality as PROJECTION substantially more efficiently.
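The bucketing step that the abstract attributes to both algorithms can be illustrated with a minimal sketch. It assumes the usual random-projection setup: k of the l positions are sampled, and each l-mer is keyed by its letters at those positions. The function names, the parameters `k`, `seed`, and `threshold`, and the plain count threshold are illustrative assumptions; in particular, the count threshold is only a stand-in and is not the neighbourhood-based scheme of AGGREGATION, which the abstract does not specify.

```python
import random
from collections import defaultdict


def project_lmers(sequences, l, k, seed=0):
    """Hash every l-mer into a bucket keyed by its letters at k sampled positions.

    Hypothetical helper: returns the sampled positions and a mapping
    bucket key -> list of (sequence index, start offset).
    """
    rng = random.Random(seed)
    # Sample k distinct positions out of 0..l-1 (the projection).
    positions = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for seq_idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            lmer = seq[start:start + l]
            key = "".join(lmer[p] for p in positions)
            buckets[key].append((seq_idx, start))
    return positions, buckets


def buckets_over_threshold(buckets, threshold):
    """Stand-in selection rule (an assumption): keep buckets with enough l-mers."""
    return {key: hits for key, hits in buckets.items() if len(hits) >= threshold}
```

A caller would pass the selected buckets on to a refinement search; swapping `buckets_over_threshold` for a different selection rule leaves the projection step unchanged.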
Similar resources
An Entropy-Based Position Projection Algorithm for Motif Discovery
The motif discovery problem is crucial for understanding the structure and function of gene expression. Over the past decades, many attempts at motif finding using consensus and probability training models have been successful. However, most existing motif discovery algorithms are still time-consuming or easily trapped in a local optimum. To overcome these shortcomings, in this paper, we propose an e...
Development of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method, called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
Neighbourhood detection and identification of spatio-temporal dynamical systems using a coarse-to-fine approach
A novel approach to the determination of the neighbourhood and the identification of spatio-temporal dynamical systems is investigated. It is shown that thresholding to convert the pattern to a binary pattern and then applying cellular automata (CA) neighbourhood detection methods can provide an initial estimate of the neighbourhood. A coupled map lattice model can then be identified using the ...
Hybrid Gibbs-sampling algorithm for challenging motif discovery: GibbsDST.
The difficulties of computational discovery of transcription factor binding sites (TFBS) are well represented by (l, d) planted motif challenge problems. Large d problems are difficult, particularly for profile-based motif discovery algorithms. Their local search in the profile space is apparently incompatible with subtle motifs and large mutational distances between the motif occurrences. Here...
A Combined Model and a Varied Gibbs Sampling Algorithm Used for Motif Discovery
The conserved sequences in gene regulatory regions dominate gene regulation. Discovering these sequences and their functions is important in the post-genome era. A novel model is constructed to represent conserved motifs of DNA sequences. This model is a combination of the PWM and WAM models. The advantage is that the new model can not only comprise individual base frequencies in the motifs, but can also em...